Method and apparatus for performing link aggregation

ABSTRACT

A network node or corresponding method of performing link aggregation reduces a number of Content Addressable Memory (CAM) entries required to make a forwarding decision for a given ingress flow, reducing cost, size, and power consumption of the CAM and accompanying static RAM. In one embodiment, an ingress flow is mapped to an egress flow identifier. Subsequently, the egress flow identifier is mapped to a member of an aggregated group associated with an egress interface based on information available in a given ingress flow. Finally, the given ingress flow is forwarded to the member of the aggregated group associated with the egress interface. A hashing technique or two lookups may be used alone or in combination in mapping the ingress flow to the egress flow identifier to reduce CAM memory usage.

RELATED APPLICATIONS

This application is a continuation-in-part of U.S. application Ser. No. 11/605,829, filed Nov. 29, 2006, which is a continuation-in-part of U.S. application Ser. No. 11/447,692, filed Jun. 5, 2006, entitled “A Method and Apparatus for Performing Link Aggregation,” now abandoned. The entire teachings of the above applications are incorporated herein by reference.

BACKGROUND OF THE INVENTION

Link aggregation allows for the grouping of multiple physical links or ports within a network node into a single aggregated interface. Aggregated interfaces can be used for increasing bandwidth of an interface and for providing port level redundancy within an interface. An ingress interface on a line card residing in the network node receives flows including multiple packets and forwards these flows to port members of an aggregated group associated with an egress interface. Line cards may utilize Content Addressable Memory (CAM) to increase the speed of link aggregation and minimize the effects of search latency.

CAM, however, is expensive and, together with static RAM or other logic, consumes a significant amount of power and takes up board space. In addition, the number of entries in the CAM used for link aggregation greatly expands as the number of aggregated links increases. As a result, the CAM has a limited number of entries for performing other necessary and useful functions, including functions associated with a multi-service network node.

SUMMARY OF THE INVENTION

A network node or corresponding method in accordance with an embodiment of the present invention reduces a number of CAM entries required to perform link aggregation. In one embodiment, a first mapping unit maps a given ingress flow to an egress flow identifier. A second mapping unit, in turn, maps the egress flow identifier to a member of an aggregated group associated with an egress interface based on information available in the given ingress flow. A flow forwarding unit forwards the given ingress flow to the member of the aggregated group associated with the egress interface.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing will be apparent from the following more particular description of example embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the present invention.

FIG. 1 is a network diagram of a portion of a communications network employing an embodiment of the present invention.

FIG. 2 is a block diagram of an example switch used in a communications network.

FIG. 3 is a block diagram of a switch that includes an ingress line card with example components.

FIGS. 4-6 are block diagrams of a switch illustrating multiple operations of the example components in an ingress line card according to embodiments of the present invention.

FIGS. 7-8 are block diagrams illustrating example components in a node of a communications network according to embodiments of the present invention.

FIGS. 9-11 are example flow diagrams performed by elements of a communications system according to embodiments of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

A description of example embodiments of the invention follows.

Typically, when a link aggregated interface of a multi-service switch receives a given flow, it searches a lookup table to determine a port member of an egress interface to which to forward the flow. The lookup table is often programmed into Content Addressable Memory (CAM) because of its speed and flexibility in supporting multiple services. In the case of layer 2 switched VLAN traffic, hundreds or thousands of VLANs may be associated with the link aggregated interface. As a result, the CAM may need tens of thousands of entries. FIG. 1 illustrates a network with switches that use link aggregation. FIG. 2 illustrates an example switch with line cards and a switch matrix supporting Layer 2 switching that may use CAM with a lookup table to support the switching. An example embodiment of the present invention illustrated in FIG. 3 reduces the number of CAM entries needed by executing a hashing algorithm on source and destination addresses of a packet in the given flow. Example embodiments of the present invention illustrated in FIGS. 4-11 further reduce the number of CAM entries needed by performing two different successive lookups. The tradeoff for reducing the number of CAM entries is increased latency because of the additional lookup. The latency, however, may be reduced by dividing the CAM into multiple cascaded CAMs and performing multiple lookups in parallel. For example, a CAM may be divided into four CAMs with each CAM dedicated to a portion of the VLANs supported by an ingress interface. FIGS. 1-11 are presented in detail below, in turn.

FIG. 1 is a network diagram of a portion of a communications network 100 employing an example embodiment of the present invention. This portion of the communications network 100 includes two switches 110, 120 (Switch A and Switch B). Switch A 110 may include any number of ingress ports 114 a, 114 b, . . . , 114 n (114 a-n), 118 a, 118 b, . . . , 118 n (118 a-n), and so forth, connected through physical links 113 and 117, respectively, to other network nodes (not shown). In accordance with a link aggregation technique, Switch A 110 may logically bond together groups of the physical links 113, 117, connected to respective groups of ingress ports 114 a-n, 118 a-n, into link aggregation groups (LAGs) 112, 116, respectively. In this way, each of the groups of physical links 113, 117 appears as one logical link. The link aggregation groups 112, 116 may be maintained according to a link aggregation configuration hierarchy that includes an aggregator (not shown) associated with each group of ingress ports 114 a-n and 118 a-n. A logical interface (not shown) can be built on the aggregator with the associated physical ports being part of the logical interface.

In an example embodiment, each link aggregation group 112, 116 has a uniquely assigned Media Access Control (MAC) address and an identifier. This MAC address can be assigned from the MAC address of one of the ports in a link aggregation group or from a pool of reserved MAC addresses not associated with any of the ports in the link aggregation group. The MAC address is used as a source address when transmitting and as a destination address when receiving.

Switch B 120 may similarly have any number of egress ports 124 a, 124 b, . . . , 124 n (124 a-n), 128 a, 128 b, . . . , 128 n (128 a-n), and so forth, connected through physical links 123, 127, respectively, to other network nodes (not shown). As with Switch A 110, Switch B 120 may logically bind together groups of the physical links 123, 127 connected to respective groups of egress ports 124 a-n, 128 a-n, into respective link aggregation groups 122, 126. Switch A 110 may also have egress ports 119 a, 119 b, 119 c, and 119 d connected through respective physical links 125 to ingress ports 129 a, 129 b, 129 c, and 129 d of Switch B 120. Both Switch A 110 and Switch B 120 may bind together the group of the physical links 125 connecting the two switches 110, 120 into a link aggregation group 130.

A given flow, including any number of packets 111 a, 111 b, . . . , 111 n (111 a-n), may be transmitted from another network node to Switch A 110 via the physical link connected to ingress port 114 a. The given flow may include multiple packets having the same source and destination addresses. Packets that are not members of the given flow may be interspersed among packets (e.g., packets 111 a-n) that are members of the given flow.

Switch A 110 may transmit or forward the same or a different flow, including packets 131 a, 131 b, . . . , 131 n (131 a-n), to Switch B 120 via at least one of the physical links 125 connecting Switch A's egress ports 119 a-d to Switch B's ingress ports 129 a-d. Switch B 120, in turn, may transmit the same or a different flow, including packets 121 a, 121 b, . . . , 121 n (121 a-n), to another network node via at least one of the physical links 127 connected to Switch B's egress ports, such as the lowermost port 128 n, as illustrated. In this manner, flows are transmitted between nodes in the communications network 100 via a Label Switched Path (LSP) or other type of path, such as an Internet Protocol (IP) path.

The aggregator (not shown) may distribute frames received from a higher layer application to one of the links used by the aggregator. In addition, the aggregator may transmit frames received from one of the links of a link aggregation group to a higher layer application in the order in which they are received.

The aggregator (not shown) may operate according to two modes: incremental bandwidth mode and link protection mode. In incremental bandwidth mode, a user can increase or decrease the bandwidth of interfaces built on an aggregator by adding or deleting members to or from the link aggregation group. For example, a user may wish to upgrade from a 100 Megabit fast Ethernet link without subscribing to a costly Gigabit fast Ethernet link. In incremental bandwidth mode, the user can take two 100 Megabit fast Ethernet links and bond them together using link aggregation to get effectively 200 Megabits of bandwidth.

In link protection mode, an “active” member is the only member within an aggregator that can transmit, while all members of the aggregator can receive. In link protection mode, the maximum bandwidth of an interface that is built on the aggregator is the bandwidth of a single member and not the sum of all the members as in incremental bandwidth mode. Thus, the other members are reserved for future use in case the “active” member goes down.
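The difference between the two aggregator modes can be illustrated with a minimal sketch. The function name, the mode labels, and the choice of the first listed member as the “active” member are assumptions made for illustration only and do not appear in the embodiments described above.

```python
def effective_bandwidth_mbps(member_rates_mbps, mode):
    """Return the usable transmit bandwidth of an aggregated interface.

    `mode` is a hypothetical label: "incremental" or "protection".
    """
    if not member_rates_mbps:
        return 0
    if mode == "incremental":
        # Every member transmits, so the bandwidth is the sum of the members.
        return sum(member_rates_mbps)
    if mode == "protection":
        # Only the "active" member transmits; the first member is assumed active.
        return member_rates_mbps[0]
    raise ValueError(f"unknown mode: {mode}")
```

For two bonded 100 Megabit links, the sketch yields 200 Megabits in incremental bandwidth mode and 100 Megabits in link protection mode.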

FIG. 2 is a block diagram of an example switch 200 (Switch A) used in a communications network. Switch A 210 may include multiple ingress line cards, such as ingress line cards A and B 232, 234, connected to multiple egress line cards, such as egress line cards A and B 233, 235, via a switch fabric 240. A flow 209, including any number of packets 211 a, 211 b, . . . , 211 n (211 a-n), may be transmitted to Switch A 210 via a link member 213 of a link aggregation group 212 associated with ingress line card A 232. In other embodiments, the ingress interface may not be aggregated. The ingress line card A 232 determines the appropriate egress line card and egress line card port to which to forward the flow 209 and forwards the flow 209 via the switch fabric 240 to one of the egress line cards 233, 235. For example, a flow, including packets 231 a, 231 b, . . . , 231 n (231 a-n), may be forwarded to a link of another link aggregation group 222.

FIG. 3 is a block diagram of a switch 300 that includes an ingress line card 332 illustrating example components of the ingress line card 332. The switch 300 also includes a switch fabric 340 and an egress line card 333. The ingress line card 332 includes a packet processor 330, logic 336, a central processing unit (CPU) 334, and Content Addressable Memory (CAM) 335. The packet processor 330 connects to the logic 336 via a bidirectional line 345. The logic 336 formats data from the result SRAM 337 in a way that the packet processor 330 understands, and the logic 336 formats data from the packet processor 330 in a way that the CAM 335 understands. The logic 336 connects to the CAM 335, and the CAM 335, in turn, connects to a result Static Random Access Memory (SRAM) 337. The result SRAM 337 then connects back to the logic 336. The logic 336 may be programmed into a Field Programmable Gate Array (FPGA). The packet processor 330, via the logic 336, may access information, such as keys 338 (shown as sets of numbers within brackets), that is organized and stored in the CAM 335.

In one example embodiment, the CAM 335 may have a maximum of 512,000 entries that are 72 bits wide or 256,000 entries that are 144 bits wide. Each CAM entry may have a corresponding SRAM entry. Thus, in this embodiment, the result SRAM 337 may have at least 512,000 or 256,000 entries if the CAM has 512,000 or 256,000 entries, respectively. The result SRAM 337 may have 192-bit-wide entries to accommodate other information besides an egress aggregate flow identifier and a flag.

The ingress line card 332 includes ingress ports 314 a, 314 b, 314 c, and 314 d (314 a-d). The ingress line card 332 may bond together the ingress ports 314 a-d into an ingress link aggregation group 312. The ingress line card 332 connects through the switch fabric 340 to the egress line card 333 having egress ports 319 a, 319 b, 319 c, and 319 d (319 a-d). The egress line card 333 may also bond together the egress ports 319 a-d into an egress link aggregation group 322. In other embodiments, any number of the ingress ports 314 a-d and egress ports 319 a-d may not be logically bound together into link aggregation groups, such as the ingress and egress link aggregation groups 312, 322.

A network operator may provision (or signal) the ingress line card 332 with configuration settings using an embodiment of the present invention. For example, the network operator may enter configuration information for a customer using VLAN ID 10 on a given fast Ethernet interface via an operator interface. In this manner, the network operator builds a circuit on the fast Ethernet interface of VLAN ID 10. The CPU 334 may then program the CAM 335, the result SRAM 337, and the packet processor 330 via the logic 336. For example, the CPU 334 may execute a lower layer of software that programs the appropriate CAM entries (i.e., CAM keys and corresponding SRAM results) via the logic 336. If implemented in software, the software may be stored on any computer readable medium (e.g., a removable storage medium such as one or more DVD-ROMs, CD-ROMs, diskettes, tapes, etc.), known or later developed in the art, having stored thereon sequences of instructions, the sequences of instructions including instructions that, when executed by a digital processor, cause the processor to perform in a manner as in embodiments of the present invention. The CPU 334 may also program the packet processor 330 with microcode instructions to analyze a given ingress flow and access information from the CAM 335 and result SRAM 337 in order to determine a link on which to forward a given ingress flow. The CPU 334 may further program the packet processor 330 with the encapsulation type of port 314 a of the ingress interface. In this example embodiment, the encapsulation type is layer 2 switched VLAN traffic.

After the CPU 334 programs the CAM 335, result SRAM 337, and packet processor 330, the packet processor 330 (1) receives a flow 309, including multiple packets 311 a, 311 b, . . . , 311 n, from one of the ingress ports 314 a-d (in this example, port 314 a) of the ingress line card 332, (2) builds a key 325 based on the contents of the flow 309, and (3) launches a CAM lookup with the key 325. The packet processor 330 executes these functions because it can do so at a significantly greater packet rate than a CPU (e.g., more than 50 million packets per second).

For a layer 2 switched Virtual Local Area Network (VLAN) key type, the key 325 may include four key parameters: an ingress interface identifier 323 a, VLAN identifier 323 b, three-bit priority 323 c, and hash value 323 d ({L2FlowId, VLAN, Priority, Hash}). The key 325 may include different key parameters for other key types, such as Internet Protocol (IP), Ethernet, or other non-VLAN key types. When the packet processor 330 receives the flow 309 from the port 314 a of the ingress link aggregation group 312, it populates the key's first entry 323 a with the layer 2 flow identifier identifying the ingress interface (e.g., “1000”). The packet processor 330 then looks at the flow's Ethernet header (not shown) to make sure the packet headers are correct and to identify the flow's VLAN tag or identifier that identifies the VLAN with which the flow 309 is associated. The packet processor 330 looks up the VLAN identifier to determine on which interface to send out the flow 309 and, optionally, swaps in a new VLAN identifier in place of the one the flow 309 had when the switch 310 received it.

The packet processor 330 also extracts the priority from a priority field, such as a three-bit priority field, in the VLAN header that is used to prioritize flows, and populates the priority key parameter field 323 c with the priority (e.g., priority “0”). Finally, depending on the flow type, the packet processor 330 extracts the source and destination addresses from the flow's Ethernet headers and runs an algorithm on the source and destination addresses to calculate a hash value, such as a four-bit hash value. The packet processor 330 populates the hash key parameter field 323 d with this hash value. The hash value indicates the specific egress port member of the egress link aggregation group 322 to which to forward the flow 309. Note that in a switch having an egress interface that is not link aggregated, the CAM keys may not include a hash field because there is no need for link aggregation.

Use of a hashing technique, such as the one described immediately above, decreases the size of a table (e.g., CAM and corresponding result SRAM) that must be indexed to determine the specific egress port member to which to forward a flow. For example, a table indexed directly by a 48-bit source MAC address requires 2⁴⁸ entries. But, by using a hashing technique, that 48-bit source MAC address may be compressed to a smaller number, such as a 10-bit number. Hashing produces duplicates or “collisions.” For example, subsets of 48-bit number variations compress to a same 10-bit number. Thus, a table may have multiple entries at a certain index that hash to the same value. As a result, hashing increases the efficiency of a lookup. In one embodiment, a hashing algorithm compresses a 48-bit source MAC address and a 48-bit destination MAC address to 4 bits. In other words, 96 bits of information are compressed to a 4-bit number. Thus, many combinations of source and destination MAC addresses can hash to the same 4-bit value. If there are a larger number of flows, there is a better chance of getting an equal distribution across all egress port members.

The hashing is typically random enough to provide some variance so that traffic is distributed evenly across the links. The hashing may be a CRC or an exclusive-or (XOR) type of operation. Hashing may also be performed on a per flow basis or on a per packet basis. With per-flow hashing, whether there are two flows or a thousand flows, if the flows originate from the same source MAC address destined to the same destination MAC address, they all hash to the same link because the same hashing operation is performed on each of the flows. Variance in the source and destination MAC addresses of the flows causes distribution of flows across multiple links. For example, if several flows originate from the same source MAC address, but they are destined to different destination MAC addresses, there is a greater probability that some flows will hash to a first link and some flows will hash to a second link.
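As a rough illustration of an XOR type of operation, the following sketch folds the 96 bits of a source and destination MAC address pair into a four-bit value. The function names and the nibble-wise fold are assumptions chosen for illustration; the embodiments do not specify a particular hashing algorithm.

```python
def mac_to_int(mac):
    """Convert a MAC address such as '00:1a:2b:3c:4d:5e' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def flow_hash(src_mac, dst_mac, bits=4):
    """XOR-fold the 96 bits of source and destination MAC into `bits` bits."""
    value = (mac_to_int(src_mac) << 48) | mac_to_int(dst_mac)
    mask = (1 << bits) - 1
    result = 0
    while value:
        # Fold the next `bits`-wide chunk into the running result.
        result ^= value & mask
        value >>= bits
    return result
```

Because the result depends only on the two addresses, every packet of a flow with the same source and destination MAC addresses produces the same hash value, which matches the per-flow behavior described above.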

A hashing operation may also be performed on individual packets, which are distributed across different links based on the hashing, even if those packets are part of the same flow. However, individual packets may arrive at a receiving side out of order. As a result, the receiving side must put individual packets back in order. This involves a significant amount of overhead, and some protocols cannot handle packets that arrive out of order. Therefore, a hashing algorithm may be run on a per flow basis, and the flows are distributed accordingly to ensure that packets associated with the same flow arrive in order.

In the example embodiment illustrated in FIG. 3, the two least significant bits of the four-bit hash value identify the egress port members (319 a-d) of the egress link aggregation group 322. A hash value of “00” identifies egress port member 319 a, a hash value of “01” identifies egress port member 319 b, a hash value of “10” identifies egress port member 319 c, and a hash value of “11” identifies egress port member 319 d. In another embodiment, a four-bit hash value may be used that supports up to sixteen egress port members. Other numbers of bits used for hash values support other numbers of egress port members.
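The mapping from the least significant hash bits to an egress port member can be sketched as follows. The list of member labels and the function name are hypothetical, and the sketch assumes the number of members is a power of two so that a simple bit mask selects a member.

```python
EGRESS_PORT_MEMBERS = ["319a", "319b", "319c", "319d"]  # illustrative labels


def select_port_member(hash_value, members=EGRESS_PORT_MEMBERS):
    """Pick a member using the least significant bits of the hash value."""
    count = len(members)
    if count == 0 or count & (count - 1):
        # Assumption of this sketch: member count is a power of two.
        raise ValueError("member count must be a power of two")
    # With four members the mask is 0b11, i.e. the two least significant bits.
    return members[hash_value & (count - 1)]
```

With four members, a four-bit hash value of binary 1011 selects the member labeled 319d, because only the trailing bits “11” are considered.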

After the packet processor populates the key 325 with key parameters 323 a-d, it launches a lookup with the key 325. Specifically, this lookup causes a search of keys 338 in the CAM 335 for a matching key. If there is a match, the CAM 335 returns an address 341 that indexes another lookup table 338 in the result SRAM 337 that has the CAM result, which may include the egress port member identifier 343. The address 341 may be an index or a pointer to some area in the result SRAM 337. The egress port member identifier 343 may include, for multiple egress line cards 333, a destination egress line card identifier and an output connection identifier (OCID). The contents of the result SRAM 337 indexed by the CAM result are then provided to the packet processor 330. The packet processor 330 then forwards the flow 309 to the appropriate egress port member (e.g., port member 319 d identified by hash value “11”) via switch fabric 340 based on the egress port member identifier 343.
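The single lookup of FIG. 3 can be modeled in software as a key-to-address table followed by an address-to-result table. The sketch below is a simplified stand-in that uses Python dictionaries in place of the CAM and result SRAM; the class and method names, and the use of exact matching rather than ternary matching, are assumptions for illustration.

```python
from typing import NamedTuple, Optional


class Key(NamedTuple):
    """Key layout {L2FlowId, VLAN, Priority, Hash} for layer 2 switched VLAN traffic."""
    l2_flow_id: int
    vlan: int
    priority: int
    hash_bits: int


class SingleLookupTable:
    def __init__(self):
        self.cam = {}          # key -> address (stand-in for the CAM)
        self.result_sram = {}  # address -> egress port member identifier

    def program(self, key: Key, egress_member: str) -> None:
        # Assumed allocation policy: next free address in order of programming.
        address = len(self.cam)
        self.cam[key] = address
        self.result_sram[address] = egress_member

    def lookup(self, key: Key) -> Optional[str]:
        address = self.cam.get(key)
        if address is None:
            return None  # no matching CAM entry
        return self.result_sram[address]
```

Programming the keys {1000, 10, 0, 0} and {1000, 10, 0, 1} with two different members, and then looking up one of them, returns the member programmed for that key; a key for an unprogrammed VLAN returns no result.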

In summary, in the above-described example embodiment of FIG. 3, flows (e.g., flow 309) from one VLAN identified by the number “10” (323 b) may come into the packet processor 330 through an ingress port member (e.g., port member 314 a) of the ingress link aggregation group 312. The egress link aggregation group 322 of the egress line card 333 may include only two active port members (e.g., port members 319 a, 319 b). In this instance, two CAM entries are used to allow incoming traffic flows to hash to the two port or link members 319 a, 319 b (e.g., {1000, 10, 0, x0} and {1000, 10, 0, x1}).

A given link can support multiple VLANs (i.e., “logical subinterfaces”). Thus, another VLAN (e.g., “11”) on the same ingress interface (e.g., “1000” (323 a)) associated with the ingress link aggregation group 312 may be forwarded to the same two port members 319 a, 319 b. In this case, another two CAM entries are used (e.g., {1000, 11, 0, x0} and {1000, 11, 0, x1}). If flows (e.g., flow 309) from four VLANs come into the packet processor 330 and the egress link aggregation group 322 includes four active port members 319 a-d, sixteen CAM entries (338) are used in total, four for each of the VLANs identified by the numbers 10-13.

Thus, the number of CAM entries is equal to the number of VLANs a user desires to support multiplied by the number of aggregated egress links or port members of the egress link aggregation group 322. For large numbers of VLANs, many CAM entries are used. For example, the ingress line card 332 may support 4,000 VLANs, numbered 10 to 4009, and the egress line card 333 may include two aggregated egress links. In this case, 4,000×2=8,000 CAM entries are used to service all possible combinations.

FIG. 4 is a block diagram of a switch 400 illustrating example components in an ingress line card 412 according to an embodiment of the present invention. In particular, FIG. 4 illustrates a new manner in which to set up the CAM entries to provide packet or flow distribution across outgoing links, i.e., determine the outgoing links to which to direct flows, as a function of the incoming flow. Like the switch 310 in FIG. 3, a switch 410 includes an ingress line card 412, switch fabric 440, and an egress line card 433. The ingress line card 412 includes a packet processor 430, CPU 434, CAM 435, logic 436, and Result SRAM 437. The CAM may be a Ternary CAM (TCAM) in which each bit of an entry has three possible match states: a binary 0, binary 1, or “Don't Care” (i.e., either a binary 0 or binary 1).
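The “Don't Care” behavior of a ternary CAM can be illustrated with a short sketch that matches a bit string against patterns containing “0”, “1”, and “x”. The function names and the first-match-wins priority rule are assumptions for illustration; actual devices may resolve multiple matches differently.

```python
def ternary_match(key_bits, pattern):
    """Return True if every pattern position is 'x' or equals the key bit."""
    if len(key_bits) != len(pattern):
        return False
    return all(p == "x" or p == k for k, p in zip(key_bits, pattern))


def tcam_lookup(key_bits, entries):
    """Return the index of the first matching entry, or None if none match.

    First-match-wins ordering is an assumption of this sketch.
    """
    for index, pattern in enumerate(entries):
        if ternary_match(key_bits, pattern):
            return index
    return None
```

This is the mechanism that lets a key parameter written as “x1” match any hash value whose least significant bit is 1.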

The packet processor 430 receives a flow 409, including multiple packets 411 a, 411 b, . . . , 411 n, through an ingress port 414. The packet processor 430 then builds a first key 421 formatted to hit a CAM entry and launches a first CAM lookup. The first key 421 includes three key parameters. The first key parameter 451 a is a layer 2 flow identifier, which identifies the interface from where the flow 409 originated. The second key parameter 451 b is a VLAN identifier which the packet processor 430 extracts from the header of the packets 411 a-n in the flow 409. The third key parameter 451 c is a priority which the packet processor 430 also extracts from the header of the packets 411 a-n in the flow 409. The first key 421, however, does not include a hash key parameter. Thus, the packet processor 430 does not extract source and destination addresses, such as a MAC or IP address, from the flow 409 and calculate a hash value when it builds the first key 421.

After the packet processor 430 builds the first key 421, it launches a first lookup by sending the first key 421 to the CAM 435. The CAM 435 searches a first lookup table 438 for a matching key and returns an address or first index 441 used to index the Result SRAM 437. The information contained in an entry of the Result SRAM located at the first index 441 may be a first result 443 that includes an “aggregated” bit or flag and the egress aggregate flow identifier. The “aggregated” bit indicates to the packet processor 430 that it should launch a second CAM lookup (also referred to as the “aggregated lookup”) because the egress interface associated with the VLAN ingress flow is aggregated. The egress aggregate flow identifier, for example, may be an 18-bit number. The packet processor 430 then builds a second key 423 formatted to hit another CAM entry.

The second key 423 includes four key parameters. The first key parameter is the flow type key parameter 453 a. The flow type key parameter 453 a identifies what type of flow is being sent out on an aggregated interface, such as the egress line card 433. When the packet processor 430 builds the second key 423, it knows the flow type of the flow 409 from the first lookup. In one embodiment, the flow type key parameter 453 a is used to distinguish between different forwarded flows that are traversing the same egress aggregated interface. For example, if layer 2 traffic and IP traffic are both traversing the same Resource Reservation Protocol (RSVP) Label-Switched Path (LSP), then the flow type key parameter 453 a is used to distinguish the layer 2 flow from the IP flow. The ingress line card 412 and the egress line card 433 may receive and send, respectively, multiple flows of different types. For example, the flows may include IP flows and layer 2 switched VLAN flows.

The second key parameter is the egress aggregate flow identifier 453 b. This parameter is a globally unique node- or router-wide flow identifier that is allocated and associated with every egress logical flow that is built on an aggregated interface. The second lookup identifies the traffic characteristics of that flow. In an example implementation, different flows can be assigned different types of traffic parameters. One flow may be a higher priority flow than another flow. In preferred embodiments, the flows do not interfere with one another. The different types of flows may be identified using this aggregate flow ID, and each may be given a certain type of treatment.

The third key parameter is a miscellaneous key parameter 453 c. This key parameter may provide additional information that is specific to the flow type 453 a and the egress aggregate flow identifier 453 b. The miscellaneous key parameter 453 c is used to make a more qualified decision as to which Output Connection Identifier (OCID) to choose. For example, if an ingress LSP is built on an aggregated IP interface and a Virtual Private LAN Service (VPLS) Destination MAC (DMAC) forwarding decision is made that returns the egress aggregate flow identifier of that LSP, then the second CAM lookup (i.e., the aggregate CAM lookup) may also need to take into account the VPLS instance identifier in order to obtain the final OCID to be used for that LSP. In this embodiment, however, the miscellaneous key parameter 453 c is not used.

The last key parameter is the hash value 453 d, which is calculated based on the source and destination MAC addresses of the flow 409.

After the packet processor 430 builds the second key 423, it launches a second CAM lookup by providing the second key 423 to the CAM 435. The CAM 435 searches a second lookup table 439 for a key matching the second key 423 and provides an address or first index 441 used to index the Result SRAM 437. The contents of the Result SRAM 437 at the first index 441 is a first result 443 which may include an egress port member identifier. The egress port member identifier may include, for multiple egress line cards (433), a destination egress line card identifier identifying the egress line card to which to forward the flow 409, and an OCID identifying the port member of the egress line card to which to forward the flow 409. The packet processor 430 then forwards the flow 409 to the appropriate egress port member (e.g., a port member identified by hash value “x1”) via the switch fabric 440.

In other words, a first lookup operation involves mapping an incoming flow that arrives on an incoming interface to an outgoing aggregated flow identifier. In other embodiments, the first lookup operation may involve mapping an {interface, flow} tuple to the outgoing aggregated flow identifier. A second lookup operation involves mapping the outgoing aggregated flow identifier to an outgoing link member of the aggregated group. In this embodiment, the outgoing aggregate flow identifier links the first lookup operation to the second lookup operation.
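The two successive lookups can be sketched with two dictionary-based tables linked by the egress aggregate flow identifier. All class, method, and label names below are hypothetical. The sketch omits the “aggregated” flag and the miscellaneous key parameter, uses exact matching in place of a CAM, and assumes the number of group members is a power of two so that the hash can be masked.

```python
class TwoStageLookup:
    """Dictionary-based stand-in for the first and second lookup tables."""

    def __init__(self):
        # First table: (l2_flow_id, vlan, priority) -> egress aggregate flow id.
        self.first_table = {}
        # Second table: (flow_type, egress_flow_id, hash_bits) -> (line_card, ocid).
        self.second_table = {}
        # Helper state (assumed): hash mask per (flow_type, egress_flow_id).
        self.hash_mask = {}

    def add_ingress_flow(self, l2_flow_id, vlan, priority, egress_flow_id):
        self.first_table[(l2_flow_id, vlan, priority)] = egress_flow_id

    def add_aggregated_group(self, flow_type, egress_flow_id, members):
        count = len(members)
        if count == 0 or count & (count - 1):
            raise ValueError("member count must be a power of two")
        self.hash_mask[(flow_type, egress_flow_id)] = count - 1
        for index, member in enumerate(members):
            self.second_table[(flow_type, egress_flow_id, index)] = member

    def forward(self, l2_flow_id, vlan, priority, flow_type, hash_value):
        # First lookup: no hash value is needed in the key.
        egress_flow_id = self.first_table.get((l2_flow_id, vlan, priority))
        if egress_flow_id is None:
            return None
        mask = self.hash_mask.get((flow_type, egress_flow_id))
        if mask is None:
            return None
        # Second lookup: the hash value selects the member of the group.
        return self.second_table[(flow_type, egress_flow_id, hash_value & mask)]

    def entry_count(self):
        return len(self.first_table) + len(self.second_table)
```

Because the hash value appears only in the second table, adding another VLAN costs one first-table entry regardless of how many members the aggregated group has.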

As described above, example embodiments of the present invention re-organize the keys in the CAM so that the first lookup is independent of the hash value. It is the use of the hash value that requires a significant number of CAM entries because each VLAN, for example, needs CAM entries corresponding to every possible hash value. The possible hash values come up in the second lookup. The number of CAM entries required by example embodiments is equal to about the number of ingress flows supported by an ingress interface plus the number of members of the aggregated group associated with the egress interface. For example, if 4,000 VLANs come in on the same ingress interface and they are destined to the same egress aggregated interface which has two members (e.g., two ports of an aggregated group), then the ingress interface needs 4,000+2=4,002 CAM entries. In comparison, for the single lookup embodiment (e.g., FIG. 3), the ingress interface needs 4,000×2=8,000 CAM entries.

A switch may have multiple egress interfaces, each of which is aggregated and has eight members. In this case, for 4,000 VLANs destined to one such egress interface, the ingress interface of the double lookup embodiment (e.g., FIG. 4) needs 4,000+8=4,008 CAM entries, whereas the ingress interface of the single lookup embodiment uses 4,000×8=32,000 CAM entries. Thus, a primary advantage of the double lookup embodiment is scalability. That is, fewer CAM entries are used for a greater number of flows. However, the number of CAM entries is reduced at the expense of having to do one more lookup.
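The entry counts in the preceding examples follow from a product in the single lookup case and a sum in the double lookup case. The two helper functions below are hypothetical names used only to reproduce that arithmetic for flows destined to a single aggregated egress interface.

```python
def single_lookup_entries(num_vlans, num_members):
    """Single lookup: one CAM entry per (VLAN, member) combination."""
    return num_vlans * num_members


def double_lookup_entries(num_vlans, num_members):
    """Double lookup: one first-table entry per VLAN plus one per member."""
    return num_vlans + num_members
```

For 4,000 VLANs, the functions give 8,000 versus 4,002 entries with two members, and 32,000 versus 4,008 entries with eight members.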

A switch is typically designed to minimize latency. If there is too much latency, packets take longer to get through the switch, and packets need to be buffered for a greater length of time. Embodiments of the present invention increase latency by performing two successive lookups instead of increasing the number of CAM entries. Adding CAM to a switch may increase the latency by a given number of clock cycles, but performing a second lookup may increase latency, for example, by half the given number of clock cycles.

In a multi-service switch, increasing latency is better than increasing the number of CAM entries because a larger number of new packets of different services may be supported. For example, switching or routing devices employing embodiments of the present invention may support frame relay services, ATM services, Ethernet, GigaEthernet (GigE), IP, IPv6, MPLS, and VLAN services. These services, whether they involve switching or routing, each require CAM resources in order to perform the forwarding function.

Link aggregation is often implemented in pure layer 2 Ethernet switches. In this case, there is no concern about using up CAM resources. In fact, the switch may not use a CAM. For example, the switch may use a different data structure that is optimized strictly for layer 2 Ethernet. But a CAM is the most flexible hardware today in a switch or router that supports multiple service types.

Many CAMs only support serial lookups. For example, in a system with four CAMs, a lookup operation involves searching each of the four CAMs one at a time until there is a match. However, a CAM may be designed to support parallel lookups in order to decrease the latency introduced by embodiments of the present invention. Thus, the first and second lookups each involve performing four parallel lookups in the four respective CAMs.

Other example flow types include port to port and IP. For port to port flows, the first key (or forwarding lookup key) includes a layer 2 flow identifier. The result of the first key lookup includes (i) an input connection identifier, (ii) an “aggregated” bit indicating that the egress interface associated with the ingress flow is aggregated, and (iii) the egress aggregate flow identifier. The second key (or aggregate lookup key) includes a port key type parameter that identifies the new aggregate lookup table as a hash lookup for aggregated interfaces. The result of the second key lookup includes the OCID and a destination egress line card identifier. The hash value for the second key is calculated from the source and destination MAC addresses of a given port to port flow.

For IP flows, the first key includes a VPN identifier and a destination IP address. The result of the first key lookup includes the “aggregated” bit and the egress aggregate flow identifier. The second key includes an IP destination key type parameter, the egress aggregate flow identifier, a miscellaneous key parameter, which may be a traffic class identifier, and the hash value. The result of the second key lookup includes the OCID and a destination egress line card identifier. The hash value for the second key is calculated from the source and destination IP addresses of a given IP flow.
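As a rough illustration of the IP flow keys, the following Python sketch models the first and second keys as named tuples and derives a hash value from the source and destination IP addresses. The field names, the "IP_DST" key type label, the two-bit hash width, and the XOR-folding hash are all assumptions chosen for the example; the specification does not prescribe a particular hash function.

```python
import ipaddress
from typing import NamedTuple


class FirstKey(NamedTuple):
    vpn_id: int
    dst_ip: str


class SecondKey(NamedTuple):
    key_type: str           # IP destination key type parameter (label is assumed)
    aggregate_flow_id: int  # egress aggregate flow identifier from the first result
    misc: int               # miscellaneous parameter, e.g. a traffic class identifier
    hash_value: int


def ip_hash(src_ip, dst_ip, bits=2):
    """Fold the source and destination addresses into a `bits`-wide value.

    XOR-folding is an assumed hash used only for illustration.
    """
    mixed = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    mask = (1 << bits) - 1
    value = 0
    while mixed:
        value ^= mixed & mask
        mixed >>= bits
    return value


def build_second_key(aggregate_flow_id, traffic_class, src_ip, dst_ip):
    """Assemble the second key once the first lookup has returned its result."""
    return SecondKey("IP_DST", aggregate_flow_id, traffic_class,
                     ip_hash(src_ip, dst_ip))
```

Because the hash depends only on the address pair, every packet of a given IP flow produces the same second key and therefore reaches the same member of the aggregated group.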

FIG. 5 is a block diagram of a switch 500 illustrating example components in an ingress line card 512 according to another embodiment of the present invention. In particular, FIG. 5 illustrates an embodiment of the invention that uses two successive lookups as applied to FIG. 3. A CAM 535 of FIG. 5 may include only eight CAM entries 538, 539 as compared to the sixteen CAM entries 338 in the CAM 535 of FIG. 3. Like the switch 310 in FIG. 3, the switch 500 includes an ingress line card 512, switch fabric 540, and an egress line card 533. The ingress line card 512 includes a packet processor 530, CPU 534, CAM 535, logic 536, and Result SRAM 537. The CAM 535 includes four entries in a first CAM lookup table 538 and four entries in a second CAM lookup table 539.

The packet processor 530 receives a flow 509, including multiple packets 511 a, 511 b, . . . , 511 n, through ingress port 514 a. The packet processor 530 then builds a first key 521 formatted to hit a CAM entry in the first CAM lookup table 538. The first key 521 includes three key parameters as described above with reference to FIG. 4. After the packet processor 530 builds the first key 521, it launches a first lookup by sending the first key 521 to the CAM 535. The CAM 535 searches the first lookup table 538 for a matching key (e.g., a first CAM entry for the first CAM lookup) and returns an address or first index 541 used to index the Result SRAM 537.

The information contained in an entry of the Result SRAM 537 located at the first index 541 is a first result 543 that includes an input connection identifier (ICID) (e.g., 200), an “aggregated” bit (e.g., 1) indicating that the packet processor 530 should launch a second CAM lookup, and the egress aggregate flow identifier (e.g., 100). The packet processor 530 then builds a second key 523 formatted to hit a CAM entry in the second lookup table 539. To build the second key 523, the packet processor 530 calculates a hash value (e.g., “11”) based on the source and destination MAC addresses of the flow 509.

The second key 523 includes four key parameters as described above with reference to FIG. 4. After the packet processor 530 builds the second key 523, it launches a second lookup by sending the second key 523 to the CAM 535. The CAM 535 searches the second lookup table 539 for a matching key (e.g., a fourth CAM entry in the second CAM lookup table 539) and returns an address or second index 542 used to index the Result SRAM 537. The information contained in an entry of the Result SRAM 537 located at the second index 542 is a second result 545 that includes, for multiple egress line cards, a destination egress line card (e.g., 1) and an output connection identifier (OCID) (e.g., 303). The packet processor 530 then forwards the flow 509 to the appropriate egress port member (e.g., port member 519 d (303) corresponding to hash value “11”) via the switch fabric 540.
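The FIG. 5 walk-through can be approximated in software as a CAM that returns the index of a matching key and a Result SRAM addressed by that index. In the Python sketch below, the example values (ICID 200, aggregated bit 1, egress aggregate flow identifier 100, hash value "11", line card 1, OCID 303) come from the description; the key layouts, the "FWD" and "AGG" labels, the remaining OCIDs, and the reduction of the first lookup table to a single entry are assumptions made to keep the example short.

```python
class Cam:
    """Toy CAM: stores keys and returns the index of the matching entry."""

    def __init__(self, keys):
        self.keys = list(keys)

    def search(self, key):
        return self.keys.index(key)  # raises ValueError on a miss


cam = Cam([
    ("FWD", "port-514a", "vlan-1"),  # index 0: first lookup table (three parameters)
    ("AGG", 100, 0, "00"),           # indices 1-4: second lookup table (four parameters)
    ("AGG", 100, 0, "01"),
    ("AGG", 100, 0, "10"),
    ("AGG", 100, 0, "11"),
])

# Result SRAM entries sit at the same indices as their CAM entries.
result_sram = [
    {"icid": 200, "aggregated": 1, "aggregate_flow_id": 100},
    {"line_card": 1, "ocid": 300},
    {"line_card": 1, "ocid": 301},
    {"line_card": 1, "ocid": 302},
    {"line_card": 1, "ocid": 303},
]


def forward(first_key, hash_value):
    """Run the first lookup and, if the aggregated bit is set, the second."""
    first_result = result_sram[cam.search(first_key)]   # first index -> first result
    if not first_result["aggregated"]:
        return first_result  # no second lookup; handling is outside this sketch
    second_key = ("AGG", first_result["aggregate_flow_id"], 0, hash_value)
    return result_sram[cam.search(second_key)]          # second index -> second result
```

With these assumed tables, `forward(("FWD", "port-514a", "vlan-1"), "11")` hits the fourth entry of the second lookup table and yields line card 1 and OCID 303, matching the example in the text.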

FIG. 6 is a block diagram of a switch 600 illustrating example components in an ingress line card 612 according to another embodiment of the present invention. Like the switch 400 in FIG. 4, the switch 600 includes an ingress line card 612, switch fabric 640, and egress line card 633. The ingress line card 612 includes a packet processor 630, CPU 634, CAM 635, logic 636, and Result SRAM 637. The CAM 635 includes one entry in a first CAM lookup table 638 and two entries in a second CAM lookup table 639.

The packet processor 630 receives a flow 609, including multiple packets 611 a, 611 b, . . . , 611 n, through a single ingress port 614. The packet processor 630 then builds a first key 621 formatted to hit a CAM entry in the first CAM lookup table 638. The first key 621 includes three key parameters as described above in reference to FIG. 4. After the packet processor 630 builds the first key 621, it launches a first lookup by sending the first key 621 to the CAM 635. The CAM 635 searches the first lookup table 638 for a matching key (e.g., a first CAM entry in the first CAM lookup table 638) and returns an address or first index 641 used to index the Result SRAM 637. The information contained in an entry of the Result SRAM 637 located at the first index 641 is a first result 643. The packet processor 630 then builds a second key 623 based on the first result 643 and formatted to hit a CAM entry in the second CAM lookup table 639. To build the second key 623, the packet processor 630 calculates a hash value (e.g., “x1”) based on the source and destination MAC addresses of the flow 609.

The second key 623 includes four key parameters as described above in reference to FIG. 4. After the packet processor 630 builds the second key 623, it launches a second lookup by sending the second key 623 to the CAM 635. The CAM 635 then searches the second lookup table 639 for a matching key (e.g., a second CAM entry in the second CAM lookup table 639). In this embodiment, the result 645 of the second lookup corresponds directly to a port ID because the index value returned by the CAM 635 self-identifies the port ID due to predetermined placement of data in the CAM 635. Thus, when the matching key is found, the CAM 635 returns an egress port identifier 645, so there is no need in this embodiment to pass the second index (i.e., port ID 645) through the Result SRAM 637. An advantage of this embodiment is decreased latency because the Result SRAM 637 is indexed once instead of twice. Moreover, less Result SRAM 637 space is used because Result SRAM entries corresponding to the entries in the second CAM lookup table 639 are eliminated. The packet processor 630 then forwards the flow 609 to the appropriate egress port member (e.g., the port member identified by hash value “x1”) via switch fabric 640.
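One way to picture the FIG. 6 variation is a CAM whose second-table entries are placed so that the returned index, less a fixed base, is itself the port ID. The Python sketch below is only an illustration under stated assumptions: the base offset, the key layouts, the "FWD" and "AGG" labels, and the treatment of the hash strings "x0" and "x1" as opaque labels are all hypothetical.

```python
SECOND_TABLE_BASE = 1  # assumed CAM index of the first entry of the second lookup table

cam_keys = [
    ("FWD", "port-614", "vlan-1"),  # index 0: the single first-table entry
    ("AGG", 100, 0, "x0"),          # index 1 -> port ID 0
    ("AGG", 100, 0, "x1"),          # index 2 -> port ID 1
]

# Result SRAM holds entries for the first lookup table only.
result_sram = [{"aggregated": 1, "aggregate_flow_id": 100}]


def egress_port_id(first_key, hash_value):
    """Return the port ID encoded by the index of the second-table match."""
    first_result = result_sram[cam_keys.index(first_key)]  # the only Result SRAM access
    second_key = ("AGG", first_result["aggregate_flow_id"], 0, hash_value)
    second_index = cam_keys.index(second_key)
    return second_index - SECOND_TABLE_BASE                # index encodes the port ID
```

In this sketch the Result SRAM is read once per packet and stores nothing for the second lookup table, which mirrors the latency and memory savings described above.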

FIG. 7 is a block diagram illustrating example components of a node 701 in a communications network 700 according to one embodiment. The node 701 includes an ingress interface 740 that receives a given ingress flow 709, which may include multiple packets 711 a, 711 b, . . . , 711 n, on a first ingress link 713 a. The first ingress link 713 a may be a member of a link aggregation group 712, which also includes a second ingress link 713 b. A first mapping unit 742 maps the given ingress flow 709 to an egress flow identifier 743. A second mapping unit 744, in turn, maps the egress flow identifier 743 to an egress link member identifier 745 based on information available in the given ingress flow 709. The egress link member identifier 745 identifies an egress link (e.g., a first egress link 723 a or a second egress link 723 b) to which to forward the given ingress flow 709. The egress links 723 a-b may be members of an aggregated group 722 associated with an egress interface 748. A flow forwarding unit 746 then forwards the given ingress flow 709 to the egress link member corresponding to the egress link member identifier 745 (e.g., the second egress link member 723 b).

FIG. 8 is a block diagram illustrating example components of a node 801 in a communications network 800 according to another embodiment. The node 801 includes an ingress interface 840 that receives a given ingress flow 809, which may include multiple packets 811 a, 811 b, . . . , 811 n, on a first ingress link 813 a. The first ingress link 813 a may be a member of a link aggregation group 812, which also includes a second ingress link 813 b. The node 801 includes an identification unit 847 that identifies parameters associated with the given ingress flow 809 to include in a first key 861 and a second key 862.

After the identification unit 847 or a first mapping unit 842 builds the first key 861, the first mapping unit 842 searches a first lookup table 851 for a match of the first key 861. A linking unit 843 then links the search of the first lookup table 851 to a search of a second lookup table 852. For example, the linking unit 843 may receive an index value 863 from the first lookup table 851 and provide part of the second key 862, such as an egress flow identifier 864, to a second mapping unit 844. The linking unit 843 may include Static Random Access Memory (SRAM) having an entry addressed by the index value 863. The entry may include the egress flow identifier 864. In this manner, the given ingress flow 809 is mapped to the egress flow identifier 864.

The node 801 may also include a hashing unit 830 that hashes or calculates a hash value 866 based on a unique identifier 865 available in the given ingress flow 809. The unique identifier 865 may include source and destination Media Access Control (MAC) addresses or source and destination Internet Protocol (IP) addresses. The second mapping unit 844 may build the second key 862 using the result 866 of the hashing unit 830, the result 864 of the linking unit 843, and other key parameters 867 identified by the identification unit 847. The second mapping unit 844 may then search the second lookup table 852 for a match of the second key 862.

When the second mapping unit 844 finds a match, it may provide an egress link member identifier 869 corresponding to the match to the traffic forwarding unit 846. In this manner, the second mapping unit 844 may map the egress flow identifier 864 to the egress link member identifier 869. The egress link member identifier 869 identifies an egress link (e.g., a first egress link 823 a or a second egress link 823 b) to which to forward the given ingress flow 809. The egress links 823 a-b may be members of an aggregated group 822 associated with an egress interface 848. The traffic forwarding unit 846 then forwards the given ingress flow 809 to the egress link member corresponding to the egress link member identifier 869 (e.g., the second egress link member 823 b).

FIG. 9 is an example flow diagram 900 performed by elements of a communications system according to an embodiment of the present invention. After starting (901), a network node maps an ingress interface to an egress flow identifier (902). The network node then maps the egress flow identifier to a member of an aggregated group associated with an egress interface based on information available in a given ingress flow (904). Finally, the network node forwards a given ingress flow to a member of the aggregated group associated with the egress interface (906) and ends the above process (908).

FIG. 10 is another example flow diagram performed by elements of the communications system. After starting (1001), parameters of a first key are identified for a given ingress flow (1002). A first look-up table is searched to find a match for the first key (1004). A key parameter is identified based on an index value from the search of the first look-up table (1006). Next, the second look-up table is searched to find a second key that includes the key parameter (1008). The given ingress flow is forwarded to a member of an aggregated group associated with a key in the second look-up table matching the second key (1010). The above process 1000 then ends (1012).

FIG. 11 is an example flow diagram performed by elements of a communications system 1100. After starting (1101), a first key is identified from a given ingress flow (1102). A CAM is searched to find a match for the first key and to obtain an index corresponding to the matching key (1104). An aggregated group identifier is obtained based on the index (1106). The source and destination IP addresses of the given ingress flow are hashed to obtain a hash key parameter (1108). Next, the CAM is searched to find a match for a second key including the hash key parameter and the aggregated group identifier (1110). Finally, the given ingress flow is forwarded to a member of an aggregated group associated with a key in the CAM matching the second key (1112). The above process 1100 then ends (1114).

While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

The term “about” allows for any differences that are within the spirit and scope of the inventions described in the present specification.

It should be understood that the forwarding logic (i.e., packet processor, CAM, and so forth) may be implemented in a line card, a motherboard (containing the forwarding and switching logic on the same printed circuit board (PCB)), or any other medium known to a person having ordinary skill in the art.

1. A method of performing link aggregation, comprising: forwarding a given ingress flow to an egress interface; and at least one of: prior to the forwarding, mapping the given ingress flow associated with a member of a first aggregated group to an ingress flow identifier based on information available in the given ingress flow; and prior to the forwarding, mapping the given ingress flow to an egress flow identifier and mapping the egress flow identifier to a member of a second aggregated group associated with the egress interface based on information available in the given ingress flow.
 2. The method according to claim 1 wherein mapping the egress flow identifier includes hashing a unique identifier available in the given ingress flow and using a result of the hashing in determining the member of the second aggregated group.
 3. The method according to claim 2 wherein the unique identifier includes source and destination Media Access Control (MAC) addresses, or source and destination Internet Protocol (IP) addresses, or a Multiprotocol Label Switching (MPLS) label.
 4. The method according to claim 1 wherein (i) mapping the given ingress flow includes searching for a match of a first key in a first lookup table and (ii) mapping the egress flow identifier to the member of the second aggregated group includes searching for a match of a second key in a second lookup table.
 5. The method according to claim 4 wherein mapping the given ingress flow provides at least part of the second key for searching the second lookup table.
 6. The method according to claim 4 wherein searching for a match of the first key results in an index value and mapping the given ingress flow further includes identifying the egress flow identifier based on the index value.
 7. The method according to claim 4 wherein searching for a match of the second key results in an index value and mapping the egress flow identifier further includes identifying the member of the second aggregated group associated with the egress interface based on the index value.
 8. The method according to claim 4 further comprising identifying at least one parameter of the first key or the second key, the parameter being associated with the given ingress flow.
 9. The method according to claim 4 wherein a number of entries in the first and second lookup tables combined is equal to a number of ingress flows supported by the ingress interface plus a number of members of the second aggregated group.
 10. The method according to claim 4 wherein searching the first and second lookup tables includes accessing Content Addressable Memory (CAM).
 11. A node, comprising: a flow forwarding unit configured to forward a given ingress flow to an egress interface; and at least one of: an ingress mapping unit configured to map, prior to the forwarding, the given ingress flow associated with a member of a first aggregated group to an ingress flow identifier based on information available in the given ingress flow; and a first egress mapping unit configured to map, prior to forwarding, the given ingress flow to an egress flow identifier and a second egress mapping unit configured to map the egress flow identifier to a member of a second aggregated group associated with the egress interface based on information available in the given ingress flow.
 12. The node according to claim 11 further comprising a hashing unit configured to hash a unique identifier available in the given ingress flow, wherein the second egress mapping unit uses the result of the hashing unit to determine the member of the second aggregated group.
 13. The node according to claim 12 wherein the unique identifier includes source and destination Media Access Control (MAC) addresses, or source and destination Internet Protocol (IP) addresses, or a Multiprotocol Label Switching (MPLS) label.
 14. The node according to claim 11 wherein (i) the first egress mapping unit searches a first lookup table for a match of a first key and (ii) the second egress mapping unit is configured to map the egress flow identifier to the member of the second aggregated group and the second egress mapping unit searches a second lookup table for a match of a second key.
 15. The node according to claim 14 further comprising a linking unit that links the search of the first lookup table to the search of the second lookup table.
 16. The node according to claim 15 wherein the linking unit receives an index value from the first lookup table and provides at least part of the second key.
 17. The node according to claim 15 wherein the linking unit is Static Random Access Memory (SRAM) having entries addressed by the index value.
 18. The node according to claim 14 further comprising an identification unit configured to identify at least one parameter of the first key or the second key, the parameter being associated with the given ingress flow.
 19. The node according to claim 14 wherein the number of entries in the first and second lookup tables combined is equal to a number of ingress flows supported by the ingress interface plus the number of members of the second aggregated group.
 20. The node according to claim 14 wherein the first and second lookup tables are Content Addressable Memory (CAM).
 21. A computer-readable medium having stored thereon sequences of instructions, the sequences of instructions including instructions, when executed by a digital processor, that cause the processor to: forward a given ingress flow to an egress interface; and at least one of: prior to the forwarding, map the given ingress flow associated with a member of a first aggregated group to an ingress flow identifier based on information available in the given ingress flow; and prior to forwarding, map the given ingress flow to an egress flow identifier and map the egress flow identifier to a member of a second aggregated group associated with the egress interface based on information available in the given ingress flow.
 22. The method according to claim 1, wherein the first aggregated group includes one or more ingress ports and the second aggregated group includes one or more egress ports.
 23. The method according to claim 1, wherein the forwarding is based upon hashing on one or more labels or one or more Internet Protocol (IP) addresses or one or more Media Access Control (MAC) addresses.
 24. The method according to claim 1, wherein a user may modify, add, or remove at least one of: one or more members of the first aggregate group and one or more members of the second aggregate group.
 25. The method according to claim 1, wherein the forwarding includes sending the given ingress flow to at least one member of the second aggregated group.
 26. The method according to claim 3 wherein the source and destination IP addresses include at least one of: IP version 4 (IPv4) addresses and IP version 6 (IPv6) addresses.
 27. The method according to claim 1, further comprising: prior to the forwarding, mapping the given ingress flow associated with the member of the first aggregated group to the ingress flow identifier based on information available in the given ingress flow.
 28. The method according to claim 1, further comprising: prior to the forwarding, mapping the given ingress flow to the egress flow identifier and mapping the egress flow identifier to the member of the second aggregated group associated with the egress interface based on information available in the given ingress flow.
 29. The method according to claim 1 wherein mapping the given ingress flow includes searching for a match of a first key in a lookup table and mapping the member of the first aggregated group to the ingress flow identifier.
 30. The method according to claim 29 wherein searching for a match of the first key results in an indexed value which is identical for members of the first aggregated group and mapping the ingress flow includes further identifying the egress interface based on the index value.
 31. The method according to claim 29 wherein searching for a match of the first key results in an indexed value which is different for members of the first aggregated group and mapping the ingress flow includes further identifying the egress interface based on the index value.
 32. The method according to claim 29 wherein searching the lookup table includes accessing Content Addressable Memory (CAM).
 33. The node according to claim 11 wherein a mapping unit is configured to map the given ingress flow including searching for a match of a first key in a lookup table and mapping the member of the first aggregated group to the ingress flow identifier.
 34. The node according to claim 33 further comprising a linking unit that links search results of the lookup table for the member of the first aggregated group to subsequent lookups.
 35. The node according to claim 34 wherein the linking unit receives an index value from the lookup table and provides at least part of a subsequent key.
 36. The node according to claim 34 wherein the linking unit is Static Random Access Memory (SRAM) having entries addressed by the index value.
 37. The method according to claim 1 wherein the forwarding includes sending the given ingress flow associated with the first aggregated group to either an egress interface associated with the second aggregated group or to another egress interface. 